MEGA: Multilingual Evaluation of Generative AI
Generative AI models have shown impressive performance on many Natural
Language Processing tasks such as language understanding, reasoning, and
language generation. An important question being asked by the AI community
today is about the capabilities and limits of these models, and it is clear
that evaluating generative AI is very challenging. Most studies on generative
LLMs have been restricted to English and it is unclear how capable these models
are at understanding and generating text in other languages. We present the
first comprehensive benchmarking of generative LLMs - MEGA, which evaluates
models on standard NLP benchmarks, covering 16 NLP datasets across 70
typologically diverse languages. We compare the performance of generative LLMs
including Chat-GPT and GPT-4 to State of the Art (SOTA) non-autoregressive
models on these tasks to determine how well generative models perform compared
to the previous generation of LLMs. We present a thorough analysis of the
performance of models across languages and tasks and discuss challenges in
improving the performance of generative LLMs on low-resource languages. We
create a framework for evaluating generative LLMs in the multilingual setting
and provide directions for future progress in the field.
Comment: EMNLP 202
Severe communication delays are independent of seizure burden and persist despite contemporary treatments in SCN1A+ Dravet syndrome: Insights from the ENVISION natural history study
Objective: Dravet syndrome (DS) is a developmental and epileptic encephalopathy characterized by high seizure burden, treatment‐resistant epilepsy, and developmental stagnation. Family members rate communication deficits among the most impactful disease manifestations. We evaluated seizure burden and language/communication development in children with DS.
Methods: ENVISION was a prospective, observational study evaluating children with DS associated with SCN1A pathogenic variants (SCN1A+ DS) enrolled at age ≤5 years. Seizure burden and antiseizure medications were assessed every 3 months and communication and language every 6 months with the Bayley Scales of Infant and Toddler Development 3rd edition and the parent‐reported Vineland Adaptive Behavior Scales 3rd edition. We report data from the first year of observation, including analyses stratified by age at Baseline: 0:6–2:0 years:months (Y:M; youngest), 2:1–3:6 Y:M (middle), and 3:7–5:0 Y:M (oldest).
Results: Between December 2020 and March 2023, 58 children with DS enrolled at 16 sites internationally. Median follow‐up was 17.5 months (range = .0–24.0), with 54 of 58 (93.1%) followed for at least 6 months and 51 of 58 (87.9%) for 12 months. Monthly countable seizure frequency (MCSF) increased with age (median [minimum–maximum] = 1.0 in the youngest [1.0–70.0] and middle [1.0–242.0] age groups and 4.5 [.0–2647.0] in the oldest age group), and remained high, despite use of currently approved antiseizure medications. Language/communication delays were observed early, and developmental stagnation occurred after age 2 years with both instruments. In predictive modeling, chronologic age was the only significant covariate of seizure frequency (effect size = .52, p = .024). MCSF, number of antiseizure medications, age at first seizure, and convulsive status epilepticus were not predictors of language/communication raw scores.
Significance: In infants and young children with SCN1A+ DS, language/communication delay and stagnation were independent of seizure burden. Our findings emphasize that the optimal therapeutic window to prevent language/communication delay is before 3 years of age.